A Character-level Convolutional Neural Network for Distinguishing Similar Languages and Dialects
Authors
Abstract
1. Competing features: AllbnAnyp, hmA, -An; verb-subject word order: AndlEt AlHrb
2. Mixing: >h HDrp, bdy vs hl syHdv
3. Morphology: Alm$kwk fyh
4. Word vs char: AlmAlky, AlhA$my
5. ASR mistakes: byt>vr vs byt>tr
6. Rare features: <HnA wyAhm. But: bqyt common in NA

Results
• 6/18 in sub-task 2; 2nd to last in sub-task 1
• Spanish most difficult, Malay/Indonesian easiest
• Gulf most confusing Arabic dialect, MSA easiest

[Figure, panel (a): Sub-task 1, test set A, Run 3]
Similar resources
Discrimination between Similar Languages, Varieties and Dialects using CNN- and LSTM-based Deep Neural Networks
In this paper, we describe a system (CGLI) for discriminating similar languages, varieties and dialects using convolutional neural networks (CNNs) and long short-term memory (LSTM) neural networks. We have participated in the Arabic dialect identification sub-task of DSL 2016 shared task for distinguishing different Arabic language texts under closed submission track. Our proposed approach is l...
Character-Aware Neural Language Models
We describe a simple neural language model that relies only on character-level inputs. Predictions are still made at the word-level. Our model employs a convolutional neural network (CNN) over characters, whose output is given to a long short-term memory (LSTM) recurrent neural network language model (RNN-LM). On the English Penn Treebank the model is on par with the existing state-of-the-art d...
An efficient method for cloud detection based on the feature-level fusion of Landsat-8 OLI spectral bands in deep convolutional neural network
Cloud segmentation is a critical pre-processing step for any multi-spectral satellite image application. In particular, disaster-related applications e.g., flood monitoring or rapid damage mapping, which are highly time and data-critical, require methods that produce accurate cloud masks in a short time while being able to adapt to large variations in the target domain (induced by atmospheric c...
Discriminating between Similar Languages with Word-level Convolutional Neural Networks
Discriminating between Similar Languages (DSL) is a challenging task addressed at the VarDial Workshop series. We report on our participation in the DSL shared task with a two-stage system. In the first stage, character n-grams are used to separate language groups, then specialized classifiers distinguish similar language varieties. We have conducted experiments with three system configurations...
Estimation of Hand Skeletal Postures by Using Deep Convolutional Neural Networks
Hand posture estimation attracts researchers because of its many applications. Hand posture recognition systems simulate the hand postures by using mathematical algorithms. Convolutional neural networks have provided the best results in the hand posture recognition so far. In this paper, we propose a new method to estimate the hand skeletal posture by using deep convolutional neural networks. T...